'Tis a lesson you should heed:
Try, try, try again.
If at first you don't succeed,
Try, try, try again.

(William Edward Hickson, 19th century educational writer) 

Abstract 

Let $G = (V, E)$ be an $n$-vertex graph and $M_d$ a $d$-vertex graph, for some constant $d$. Is $M_d$
a subgraph of $G$? We consider this problem in a model where all $n$ processes are connected
to all other processes, and each message contains up to $O(\log n)$ bits. A simple deterministic
algorithm that requires $O(n^{(d-2)/d} / \log n)$ communication rounds is presented. For the special
case that $M_d$ is a triangle, we present a probabilistic algorithm that requires an expected
$O(\lceil n^{1/3} / (t^{2/3} + 1) \rceil)$ rounds of communication, where $t$ is the number of triangles in the graph,
and $O(\min\{n^{1/3} \log^{2/3} n / (t^{2/3} + 1), n^{1/3}\})$ with high probability.
We also present deterministic algorithms specially suited for sparse graphs. In any graph of
maximum degree $\Delta$, we can test for arbitrary subgraphs of diameter $D$ in $O(\lceil \Delta^{D+1}/n \rceil)$ rounds.
For triangles, we devise an algorithm featuring a round complexity of $O(A^2/n + \log_{2+n/A^2} n)$,
where $A$ denotes the arboricity of $G$.



1 Introduction 

In distributed computing, it is common to represent a distributed system as a graph whose nodes 
are computational devices (or, more generally, any kind of agents) and whose edges indicate which 
pairs of devices can directly communicate with each other. Since its infancy, the area has been 
arduously studying the so-called local model (cf. [15]), where the devices try to jointly compute 
some combinatorial structure, such as a maximal matching or a node coloring, of this communication 
graph. In its purest form, the local model is concerned with one parameter only: the locality of
a problem, i.e., the number of hops up to which nodes need to learn the topology and local portions
of the input in order to compute their local parts of the output; for example, this could be whether
or not an outgoing edge is in the maximal matching, or the color of the node.

Considerable efforts have been made to understand the effect of bounding the amount of
communication across each edge. In particular, the congest model, which demands that in each time
unit at most $O(\log n)$ bits are exchanged over each edge, has been studied intensively. However, to
the best of our knowledge, all known lower bounds rely on "bottlenecks" [9, 11, 16], i.e., small edge
cuts that severely constrain the total number of bits that may be communicated between different
parts of the graph. In contrast, very little is known about the possibilities and limitations in case
the communication graph is a clique, i.e., the communication bounds are symmetric and independent
of the structure of the problem we need to solve. The few existing works show that, as one can
expect, such a distributed system model is very powerful: A minimum spanning tree can be found
in $O(\log \log n)$ time [12]; with randomization, nodes can send and receive up to $O(n)$ messages of
size $O(\log n)$ in $O(1)$ rounds, without any initial knowledge of which nodes hold messages for which
destinations [10]; and, using the latter routine, they can sort $n^2$ keys in $O(1)$ rounds (where each
node holds $n$ keys and needs to learn their index in the sorted sequence) [14]. In general, none of
these tasks can be performed fast in the local model, as the communication graph might have a
large diameter.

In the current paper, we examine a question that appears to be hard even in a clique if message
size is constrained to be $O(\log n)$. Given that each node initially knows its neighborhood in an
input graph, the goal is to decide whether this graph contains some subgraph on $d \in O(1)$ vertices.
In the local model, this can be trivially solved by each node learning the topology up to a constant
distance;¹ in our setting, this simple strategy might result in a running time of $\Omega(n/\log n)$, as some
(or all) nodes may have to learn about the entire graph and thus need to receive $\Omega(n^2)$ bits. We
devise a number of algorithms that achieve much better running times. These algorithms illustrate
that efficient algorithms in the contemplated model need to strive for balancing the communication
load, and we show some basic strategies to do so. We will show as a corollary that it is possible for
all nodes to learn about the entire graph within $O(\lceil |E|/n \rceil)$ rounds and therefore locally solve any
(computable) problem on the graph; this refines the immediately obvious statement that the same
can be accomplished within $\Delta$ rounds (where $\Delta$ denotes the maximum degree of $G$) by each node
sending its complete list of neighbors to all other nodes. For various settings, we achieve running
times of $o(|E|/n)$ by truly distributed algorithms that do not require that (some) nodes obtain full
information on the entire input.

Apart from shedding more light on the power of the considered model, the detection of small 
subgraphs, sometimes referred to as graphlets or network motifs, is of interest in its own right. 
Recently, this topic received growing attention due to the importance of recurring patterns in 
man-made networks as well as natural ones. Certain subgraphs were found to be associated with 
neurobiological networks, others with biochemical ones, and others still with human-engineered 
networks [13]. Detecting network motifs is an important part of understanding biological networks, 
for instance, as they play a key role in information processing mechanisms of biological regulation 
networks. Even motifs as simple as triangles are of interest to the biological research community as 
they appear in gene regulation networks, where what a graph theorist would call a directed triangle 
is often referred to as a Feed-Forward Loop. In recent years, the network motifs approach to
studying networks led to the development of dedicated algorithms and software tools. Being of a highly
applicative nature, algorithms used in such contexts are usually researched from an experimental
point of view, using naturally generated data sets [8].

¹ In the local model, one is satisfied with at least one node detecting a respective subgraph. Requiring that the
output is known by all nodes results in the diameter being a trivial lower bound for any meaningful problem.

Triangles and triangle-free graphs also play a central role in combinatorics. For example, planar
triangle-free graphs have long been known to be 3-colorable [7]. The implications of triangle finding
and triangle-freeness motivated extensive research of algorithms, as well as lower bounds, in the
centralized model. Most of the work done on these problems falls into one of two categories:
subgraph listing and property testing. In subgraph listing, the aim is to list all copies of a given
subgraph. The number of copies in the graph, which may be as high as $\Theta(n^3)$ for triangles, sets
an obvious lower bound for the running time of such algorithms, rendering instances with many
triangles harder in some sense [4]. Property testing algorithms, on the other hand, distinguish
with some probability between graphs that are triangle-free and graphs that are far from being
triangle-free, in the sense that a constant fraction of the edges has to be removed in order for
the graph to become triangle-free [1, 2]. Although soundly motivated by stability arguments, the
notion of measuring the distance from triangle-freeness by the minimal number of edges that need
to be removed seems less natural than counting the number of triangles in the graph. Consider
for instance the case of a graph with $n$ nodes comprised of $n - 2$ triangles, all sharing the same
edge. From the property testing point of view, this graph is very close to being triangle-free,
although it contains a linear number of triangles. Some query-based algorithms were suggested in
the centralized model, where the parameter to determine is the number of triangles in the graph.
The lower bounds for such algorithms assume restrictions on the type of queries² that cannot be
justified in our model [6].
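
The example of $n - 2$ triangles sharing one edge can be checked directly. The following Python sketch is only an illustration: the helper names `book_graph` and `count_triangles` and the choice $n = 12$ are assumptions made here, not part of the paper. It builds the graph, counts its triangles by brute force, and confirms that deleting the single shared edge leaves a triangle-free graph.

```python
from itertools import combinations

def book_graph(n):
    """Edge set of n - 2 triangles that all share the edge {0, 1}."""
    edges = {frozenset((0, 1))}
    for v in range(2, n):
        edges.add(frozenset((0, v)))
        edges.add(frozenset((1, v)))
    return edges

def count_triangles(n, edges):
    """Brute-force triangle count over all 3-subsets of the nodes."""
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edges
    )

n = 12
edges = book_graph(n)
triangles_before = count_triangles(n, edges)
# Removing the one shared edge destroys every triangle at once.
triangles_after = count_triangles(n, edges - {frozenset((0, 1))})
```

Here the edit distance to triangle-freeness is one edge, while the triangle count grows linearly in $n$, which is the gap between the two measures discussed above.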

Detailed Contributions. In Section 3, we start out by giving a family of deterministic
algorithms that decide whether the graph contains a $d$-vertex subgraph within $O(n^{(d-2)/d})$ rounds.
In fact, these algorithms find all copies of this subgraph and therefore could be used to count
the exact number of occurrences. They split the task among the nodes such that each node
is responsible for checking an equal number of subsets of $d$ vertices for being the vertices of a
copy of the targeted subgraph. This partition of the problem is chosen independently of the
structure of the graph. Note that even the trivial algorithm that lets each node collect its $D$-hop
neighborhood and test it for instances of the subgraph in question does not satisfy this property.
Still, it exhibits a structure that is simple enough to permit a deterministic implementation of
running time $O(\lceil \Delta^{D+1}/n \rceil)$, where $\Delta$ is the maximum degree of the graph, given in Section 4. For
the special case of triangles, we present a more intricate way of checking neighborhoods that results
in a running time of $O(A^2/n + \log_{2+n/A^2} n) \subseteq O(|E|/n + \log n)$, where the arboricity $A$ of the graph
denotes the minimal number of forests into which the edge set can be decomposed. While always
$A \le \Delta$, it is possible that $A \in O(1)$, yet $\Delta \in \Theta(n)$ (e.g., in a graph that is a star). Moreover, any
family of graphs excluding a fixed minor has $A \in O(1)$ [5], demonstrating that the arboricity is a
much less restrictive parameter than $\Delta$. Note also that the running time bound in terms of $|E|$ is
considerably weaker than the one in terms of $A$; it serves to demonstrate that in the worst case,
the algorithm's running time essentially does not deteriorate beyond the trivial $O(|E|/n)$ bound.

All our deterministic algorithms systematically check for subgraphs by either considering all 
possible combinations of d nodes or following the edges of the graph. If there are many copies of 
the subgraph available, it can be much more efficient to randomly inspect small portions of the 
graph. In Section 5, we present a triangle-finding algorithm that does just that, yielding that for
every $\varepsilon > 1/n$ and a graph containing $t \ge 1$ triangles, a triangle will be found with probability at
least $1 - \varepsilon$ within $O((n^{1/3} \log^{2/3} \varepsilon^{-1}) / t^{2/3} + \log n)$ rounds; we show this analysis to be tight.

² For instance, in [6] the query model requires that edges are sampled uniformly at random.

All our algorithms are uniform, i.e., they require no prior knowledge of parameters such as t or 
A. Interleaving them will result in an asymptotic running time that is bounded by the minimum of 
all the individual results. All proofs are omitted from this extended abstract due to lack of space, 
and are detailed in full in the appendix. 

2 Model and Problem 

Our model separates the computational problem from the communication model. The set $V =
\{1, \ldots, n\}$ represents the nodes of a distributed system. With respect to communication, we adhere
to the synchronous congest model as described in [15] on the complete graph on the node set $V$,
i.e., in each computational round, each node may send (potentially different) $O(\log n)$ bits to each
other node. We do not consider the amount of computation performed by each node; however,
for all our algorithms it will be polynomially bounded. Instead, we measure complexity in the
number of rounds until an algorithm terminates.³ Let $G = (V, E)$ be an arbitrary graph on the
same vertex set, representing the computational problem at hand. Initially, every node $i \in V$ has
the list $\mathcal{N}_i := \{j \in V \mid \{i, j\} \in E\}$ of its neighbors in $G$, but no further knowledge of $G$.

The computational problem we are going to consider throughout this paper is the following:
Given a graph $M_d$ on $d \in O(1)$ vertices, we wish to discover whether $M_d$ is a subgraph of $G$.

3 Deterministic Algorithms for General Graphs 

During our exposition, we will discuss the issues of what to communicate and how to communicate
it separately. That is, given sets of $O(\log n)$-sized messages at all nodes satisfying certain
properties, we provide subroutines that deliver all messages quickly, and use these subroutines in our
algorithms. We start out by giving a very efficient deterministic scheme provided that origins and
destinations of all messages are initially known to all nodes. We then will show that this scheme
can be utilized to find all triangles or other constant-sized subgraphs in sublinear time.

3.1 Full-Knowledge Message Passing 

For a certain limited family of algorithms that we call oblivious algorithms, it is possible to exploit 
the full capacity of the communication system, i.e., provided that no node sends or receives more 
than n messages, all messages can be delivered in two rounds. 

Definition 3.1. A distributed algorithm A in our model is said to be oblivious if the sources and 
destinations of all messages are determined in advance, regardless of the input graph G, and each 
source can determine the content of its messages from its input. 

Take for example an algorithm in which every node i sends every node j a bit string stating 
for each node k whether it is a neighbor of i. This algorithm is clearly oblivious and results in all 
nodes having complete knowledge of the structure of G. 

As communication is peer-to-peer, sending messages to different nodes can be executed in
parallel. If all nodes execute the above suggested routine, after $n$ rounds every node gets all lists
of immediate neighbors of nodes, and can therefore reconstruct the graph locally. We will see later
on, in Section 4, that a similar algorithm can be realized more efficiently using a more evolved
communication strategy.

³ Note that it is trivial to make all nodes terminate in the same round due to the full connectivity.

Algorithm 1 Deterministic Message Passing at node $i$ holding message set $S$.

    $S' := \emptyset$, $S'' := \emptyset$
    // first stage (distribution)
    for $m_{j,k} \in S$ do
        send $m_{j,k}$ to node $k$
    for received message $m$ do
        $S' := S' \cup \{m\}$
    // second stage (delivery)
    for $m_{j,k} \in S'$ do
        send $m_{j,k}$ to node $j$
    for received message $m$ do
        $S'' := S'' \cup \{m\}$
    return $S''$

We now turn to describing our communication pattern for oblivious algorithms. To this end, 
we will need the following claim that is a corollary of Hall's marriage theorem. 

Claim 3.2. Every d-regular bipartite multigraph is a disjoint union of d perfect matchings. 

Proof. By induction on d. For d = 1 the graph is a perfect matching by definition. 

Assume that the claim holds for some $d$, and let $H = (L, R, E)$ be a $(d + 1)$-regular bipartite
multigraph. Let $S \subseteq L$ be some set of vertices, and define $\Gamma(S) := \{u \in R : \exists v \in S \text{ s.t. } (v, u) \in E\}$.
By regularity, the sum of degrees in $S$ is exactly $(d + 1)|S|$, and by the pigeonhole principle and
regularity $|\Gamma(S)| \ge (d + 1)|S|/(d + 1) = |S|$, satisfying Hall's marriage condition and thus implying that
a perfect matching exists. Removing the perfect matching found from the graph leaves a $d$-regular
bipartite multigraph that is a disjoint union of $d$ perfect matchings by the induction hypothesis. Adding
those $d$ perfect matchings to the one just obtained completes the proof. □

Lemma 3.3. Given a bulk of messages, such that: 

1. The source and destination of each message are known in advance to all nodes, and each source
knows the contents of the messages to be sent.

2. No node is the source of more than n messages. 

3. No node is the destination of more than n messages. 

A routing scheme to deliver all messages within 2 rounds can be found efficiently. 

Proof. WLOG we assume every node is the source of exactly $n$ messages, and it is the destination
of exactly $n$ messages as well (having a node "send a message to itself" is not a problem). We
will label every message to node $i$ with a different $j \in \{1, \ldots, n\}$ and denote the messages to node $i$
according to this labeling by $m_{i,1}, m_{i,2}, \ldots, m_{i,n}$.

We define a good labeling to be such that no node initially holds two messages labeled $m_{j,k}$
and $m_{l,k}$ for some $l, j, k$ with $l \ne j$. Assuming we start with a good labeling, we argue that the
message passing algorithm whose pseudo-code is given in Algorithm 1 terminates successfully after
two rounds. We will later show that a good labeling is always attainable.

If our labeling is indeed good, then during the first stage every node sends at most a single
message to each of the other nodes, and therefore can dispose of all the messages in $S$ within the first
round. Due to the unique labeling of the messages, after the first stage node $i$ holds all messages
of type $m_{k,i}$, and since there is at most one such message for each $k$, all of them are emitted within
a single round in the second stage. Clearly, the labeling also ensures that the returned set $S''$ will
contain exactly the messages whose destination is $i$.

It remains to show that we can find a good labeling. Recall that sources and destinations are 
known in advance to all nodes, so each node can compute the labeling locally. If all nodes use the 
same deterministic algorithm, this will result in all nodes using the exact same labeling. 

Let $B = (L, R, E)$ be a bipartite multigraph, where $|L| = |R| = n$. We denote $L = \{l_1, \ldots, l_n\}$
and $R = \{r_1, \ldots, r_n\}$. For every message in the initial bulk with source $i$ and destination $k$ we add
an edge $(l_i, r_k)$ to $E$. $B$ is clearly an $n$-regular multigraph, and by Claim 3.2 it is a disjoint union of
$n$ perfect matchings. We now choose a perfect matching in this graph, remove its edges and label
the messages represented by those edges thus: for every edge $(l_i, r_k)$ in the matching we label its
corresponding message $m_{k,1}$. After removing those edges we find another perfect matching, and
for every edge $(l_i, r_k)$ in it we label the corresponding message $m_{k,2}$, and so on, until we remove the
$n$th perfect matching from the graph. Since a perfect matching is easy to find (using maximum-flow
algorithms), a good labeling can be found efficiently. □
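
The labeling argument and the two stages of Algorithm 1 can be simulated together. The sketch below is a minimal illustration under stated assumptions: it uses simple augmenting paths instead of a maximum-flow routine, labels run from 0 to $n-1$, and the names `perfect_matching`, `good_labeling` and `route_in_two_rounds` are hypothetical. The assertions inside the routing function check that no link carries more than one message per round.

```python
import random

def perfect_matching(mult):
    """Perfect matching [(left, right), ...] of a regular bipartite multigraph,
    given as a matrix of edge multiplicities (augmenting-path search)."""
    n = len(mult)
    left_of = [-1] * n  # left_of[v]: left node currently matched to right node v

    def augment(u, seen):
        for v in range(n):
            if mult[u][v] > 0 and not seen[v]:
                seen[v] = True
                if left_of[v] == -1 or augment(left_of[v], seen):
                    left_of[v] = u
                    return True
        return False

    for u in range(n):
        if not augment(u, [False] * n):
            raise ValueError("no perfect matching: multigraph is not regular")
    return [(left_of[v], v) for v in range(n)]

def good_labeling(messages, n):
    """Relay index per (source, destination, payload) message; the j-th
    matching removed from the multigraph receives label j."""
    mult = [[0] * n for _ in range(n)]
    pending = {}
    for idx, (src, dst, _) in enumerate(messages):
        mult[src][dst] += 1
        pending.setdefault((src, dst), []).append(idx)
    relay = [None] * len(messages)
    for j in range(n):
        for src, dst in perfect_matching(mult):
            mult[src][dst] -= 1
            relay[pending[(src, dst)].pop()] = j
    return relay

def route_in_two_rounds(messages, n):
    relay = good_labeling(messages, n)
    held = [[] for _ in range(n)]
    used = set()
    for (src, dst, payload), r in zip(messages, relay):
        assert (src, r) not in used  # first stage: one message per link
        used.add((src, r))
        held[r].append((dst, payload))
    inbox = [[] for _ in range(n)]
    used = set()
    for r in range(n):
        for dst, payload in held[r]:
            assert (r, dst) not in used  # second stage: one message per link
            used.add((r, dst))
            inbox[dst].append(payload)
    return inbox

# Example instance (an assumption): the union of n random permutations,
# so every node is the source and the destination of exactly n messages.
n = 5
rng = random.Random(1)
messages = []
for t in range(n):
    perm = list(range(n))
    rng.shuffle(perm)
    messages += [(s, perm[s], (s, perm[s], t)) for s in range(n)]
inbox = route_in_two_rounds(messages, n)
```

Because every node runs the same deterministic labeling on the same publicly known source and destination pairs, all nodes would arrive at the same relay assignment without communicating.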

Corollary 3.4. An oblivious algorithm in which each node sends and receives at most $T(n)$
messages can be completed within $2\lceil T(n)/n \rceil$ rounds, by repeatedly using the message passing routine
described above.

3.2 TriPartition - Finding triangles deterministically 

Next, we present an algorithm that finds whether there are triangles in G. The algorithm is not 
oblivious, since in the final step every node broadcasts whether it found a triangle or not to all 
other nodes. This last broadcast message is obviously dependent on other messages transferred 
throughout the algorithm, therefore it violates the obliviousness requirement that the order of 
the messages will not matter. However, having every node broadcast its results takes a single 
round only. The first part of the algorithm is oblivious, allowing us to apply the message passing 
algorithm previously stated to it. As the oblivious part of the algorithm terminates, we run the 
final broadcasting round. 

Let $\mathcal{S} \subseteq 2^V$ be a partition of $V$ into equally sized subsets of cardinality $n^{2/3}$. We write $\mathcal{S} =
\{S_1, \ldots, S_{n^{1/3}}\}$. To each node $i \in V$ we assign a distinct triplet from $\mathcal{S}$ denoted $S_{i,1}, S_{i,2}, S_{i,3}$ (where
repetitions are admitted). Clearly, for any subset of three nodes there is a triplet such that each
node is an element of one of the subsets in the triplet, showing the following claim.

Claim 3.5. For each triangle $\{t_1, t_2, t_3\}$ in $G$, there is some node $i$ such that $t_1 \in S_{i,1}$, $t_2 \in S_{i,2}$,
and $t_3 \in S_{i,3}$.

Proof. Each node checks for triangles that are contained in its triplet of subsets by executing 
TriPartition, whose pseudo-code is given in Algorithm 2. □ 

Theorem 3.6. TriPartition determines correctly whether there exists a triangle in $G$ and can be
implemented within $O(n^{1/3})$ rounds.

Algorithm 2 TriPartition at node $i$.

    $E_i := \emptyset$
    for $1 \le j < k \le 3$ do
        for $l \in S_{i,j}$ do
            retrieve $\mathcal{N}_l \cap S_{i,k}$
            for $m \in \mathcal{N}_l \cap S_{i,k}$ do
                $E_i := E_i \cup \{l, m\}$
    if there exists a triangle in $G_i := (V, E_i)$ then
        send "triangle" to all nodes
    if received "triangle" from some node then
        return true
    else
        return false



Proof. Correctness follows from Claim 3.5, as node $i$ collects exactly the edges between pairs of
subsets in its triplet. The round complexity is deduced as follows. Since the assignment of set
triplets is static, each node $i$ knows which nodes need to learn about which of its neighbors. Since
there are $n^{1/3}$ subsets of size $n^{2/3}$, each of which participates in $n^{1/3}$ triplets involving the subset
containing $i$, the node needs to transmit at most $n^{4/3}$ messages.⁴ On the other hand, each node
needs to learn about at most $\binom{3}{2} n^{4/3}$ edges, one for each pair of nodes from two of its subsets. By
Corollary 3.4, this information can thus be communicated within $O(n^{1/3})$ rounds. The algorithm
terminates one additional round later, completing the proof. □
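
To see Claim 3.5 and Algorithm 2 at work, TriPartition can be simulated centrally. This is a sketch under assumptions that the paper does not fix: nodes are numbered from 0, $n = m^3$ for an integer $m$, the subsets are blocks of $m^2$ consecutive nodes, and node $i$ takes the triplet named by the base-$m$ digits of $i$. The function names are hypothetical.

```python
from itertools import combinations

def tri_partition(adj, m):
    """Simulate TriPartition on n = m**3 nodes; adj[v] is the neighbor set of v."""
    n = m ** 3
    size = m * m  # each of the m subsets holds n^(2/3) = m^2 consecutive nodes
    subsets = [set(range(s * size, (s + 1) * size)) for s in range(m)]
    for i in range(n):
        # Assumed assignment: the base-m digits of i name its triplet of subsets.
        triplet = (i // (m * m), (i // m) % m, i % m)
        local_edges = set()
        for j, k in combinations(range(3), 2):
            for l in subsets[triplet[j]]:
                for w in adj[l] & subsets[triplet[k]]:
                    local_edges.add(frozenset((l, w)))
        candidates = set().union(*(subsets[s] for s in triplet))
        for edge in local_edges:
            u, v = tuple(edge)
            for x in candidates:
                if frozenset((u, x)) in local_edges and frozenset((v, x)) in local_edges:
                    return True  # node i would broadcast "triangle"
    return False

def make_adj(n, edges):
    """Neighbor sets of an undirected graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

cycle = [(v, (v + 1) % 8) for v in range(8)]
no_triangle = tri_partition(make_adj(8, cycle), 2)
with_triangle = tri_partition(make_adj(8, cycle + [(3, 5)]), 2)
```

With $m = 2$ the triangle $\{3, 4, 5\}$ spans the subsets $(0, 1, 1)$, so under the assumed digit assignment it is node 3 that collects all three of its edges.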

Remark 3.7. The fact that (except for the potential final broadcast) the entire communication
pattern of TriPartition is predefined makes it possible to refrain from including any node identifiers in the
messages. That is, instead of encoding the respective sublist of neighbors by listing their identifiers,
nodes just send a 0-1 array of bits indicating whether a node from the respective set from $\mathcal{S}$ is or
is not a neighbor in $G$. The receiving node can decode the message because it is already known in
advance which bit stands for which pair of nodes. We may hence improve the round complexity of
TriPartition to $O(n^{1/3}/\log n)$.

3.3 Generalization for d-cliques

TriPartition generalizes easily to an algorithm we call dCliqueO that finds $d$-cliques (as well as
any other subgraph on $d$ vertices). We choose $\mathcal{S}$ to be a partition of $V$ into equal size subsets of
cardinality $n^{(d-1)/d}$, resulting in $\mathcal{S} = \{S_1, \ldots, S_{n^{1/d}}\}$. Each node now examines the edges between
all pairs of some $d$-sized multisubset of $\mathcal{S}$ (as we did for $d = 3$ in TriPartition). Since there are
exactly $|\mathcal{S}|^d = n$ such multisets, all possible $d$-cliques are examined. Every node needs to receive
the list of edges for all $\binom{d}{2}$ pairs, each containing at most $(n^{(d-1)/d})^2$ edges, thus every node needs
to send and receive at most $O(n^{(2d-2)/d})$ messages.

Theorem 3.8. dCliqueO determines correctly whether there exists a $d$-clique (or any given $d$-vertex
graph) in $G$ within $O(n^{(d-2)/d} / \log n)$ rounds.

Proof. Similarly to the 3-vertex case, we apply Corollary 3.4, and due to obliviousness, we may 
assume all messages are sequences of bits as in Remark 3.7. □ 
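
The round bound of Theorem 3.8 can be retraced as a short calculation. This is a sketch, assuming that each potential edge costs one bit (as in Remark 3.7), that one message carries $\Theta(\log n)$ such bits, and that constants depending on $d$ are suppressed; $T(n)$ is the quantity from Corollary 3.4.

```latex
\begin{align*}
\text{bits received per node} &\le \binom{d}{2}\bigl(n^{(d-1)/d}\bigr)^{2} = O\bigl(n^{(2d-2)/d}\bigr),\\
T(n) &= O\bigl(n^{(2d-2)/d} / \log n\bigr) \quad \text{messages per node},\\
\text{rounds} &\le 2\lceil T(n)/n \rceil = O\bigl(n^{(2d-2)/d - 1} / \log n\bigr) = O\bigl(n^{(d-2)/d} / \log n\bigr).
\end{align*}
```

For $d = 3$ the exponent is $1/3$, which agrees with the improved bound for TriPartition stated in Remark 3.7.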

⁴ Clearly, a neighbor can be encoded using $\log n$ bits.

Algorithm 3 TriNeighbors at node $i$.

    $E_i := \emptyset$
    for $j \in V$ s.t. $\{i, j\} \in E$ do
        retrieve $\mathcal{N}_j$
        for $k \in \mathcal{N}_j$ do
            $E_i := E_i \cup \{j, k\}$
    if there exists a triangle in $G_i := (V, E_i)$ then
        send "triangle" to all nodes
    if received "triangle" from some node then
        return true
    else
        return false



4 Finding triangles in sparse graphs 

In graphs that have $o(n^2)$ edges, one might hope to obtain faster algorithms. However, the
algorithms from the previous section have congestion at the node level, i.e., even if there are few edges
in total, some nodes may still have to send or receive lots of messages. Hence, we need different
strategies for sparse graphs. In this section, we derive bounds depending on parameters that reflect
the sparsity of graphs.

4.1 Bounded Degree 

We start with a simple value, the maximum degree $\Delta := \max_{i \in V} \delta_i$, where the degree of node
$i$ is $\delta_i := |\mathcal{N}_i|$. If $\Delta$ is relatively small, then every node can simply exchange its neighbors list
with all its neighbors. We refer to this as the TriNeighbors algorithm, whose pseudo-code is given in
Algorithm 3. In a graph with bounded $\Delta$, it will be much faster than the dCliqueO algorithm.
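
The neighbor-exchange idea is short enough to simulate centrally. The sketch below is an illustration, not the distributed implementation: `tri_neighbors` is a hypothetical name, and the message count is the simple estimate that node $i$ sends its list of $\delta_i$ identifiers to each of its $\delta_i$ neighbors.

```python
def tri_neighbors(adj):
    """Return (triangle_found, max_messages_sent) for neighbor sets adj."""
    found = False
    max_sent = 0
    for i, nbrs in enumerate(adj):
        # Node i sends its list of len(nbrs) identifiers to each neighbor.
        max_sent = max(max_sent, len(nbrs) ** 2)
        for j in nbrs:
            if adj[j] & nbrs:  # a common neighbor k closes the triangle {i, j, k}
                found = True
    return found, max_sent

# A star on 6 nodes (no triangle, but a high-degree center) and a triangle.
star = [set(range(1, 6))] + [{0} for _ in range(5)]
triangle = [{1, 2}, {0, 2}, {0, 1}]
star_result = tri_neighbors(star)
triangle_result = tri_neighbors(triangle)
```

The star shows why the bound depends on $\Delta$: its center alone accounts for $\Delta^2$ messages even though the graph has only $n - 1$ edges.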

We use an elegant message-passing technique, suggested to us by Shiri Chechik [3]. Assuming 
that (i) no node is the source of more than n messages in total, (ii) no node is the destination of more 
than n messages, and (iii) every node sends the exact same messages to all of the destinations for its 
messages, it delivers all messages in 3 rounds. This is done by first having each node distribute its 
messages evenly, in a Round-Robin fashion, to all other nodes in the graph. In the second phase, 
messages are retrieved in a similar Round-Robin process. This divides the communication load 
evenly, resulting in an optimal round complexity. Assuming that for each node $i$ we have the set
of its $k(i)$ messages $M_i = \{m_{i,1}, \ldots, m_{i,k(i)}\}$, let $D_i$ denote its recipients list. With these notations,
Chechik's Round-Robin Messaging algorithm is given in Algorithm 4.

Lemma 4.1. Given a bulk of messages in which: 

1. Every node is the source of at most n messages. 

2. Every node is the destination of at most n messages. 

3. Every source node sends exactly the same information to all of its destination nodes and 
knows the content of its messages. 

Round-Robin-Messaging delivers all messages in 3 rounds. 

Proof. In the first loop of the algorithm every node sends one message to every other node; note
that it is feasible to send both the message $m_{i, j \bmod k(i)}$ and a potential notification at the same
time. The cyclic nature of the message distribution in this first loop assures that any consecutive
$k(i)$ nodes together hold all $k(i)$ messages of node $i$, exactly one at each node. By Condition 1,
$k(i) \le n$ for each node $i$, i.e., each node indeed sends out all its messages. By Condition 2, for
each node the querying loop will request at most one message from each node. Since exactly $k(j)$
messages of a node $j$ are requested, from $k(j)$ consecutive nodes, the set of messages retrieved in the
second last loop contains $M_j$. By Condition 3 and due to the previous notification of destination nodes,
this is exactly the set of messages to be received from $j$. This shows correctness of the algorithm. As we
also argued that in total three communication rounds are required, this shows the statement of the lemma. □

Algorithm 4 Round-Robin-Messaging at node $i$.

    $R := \emptyset$ // collects output
    $S := \emptyset$ // collects source nodes and number of messages for $i$
    for $j \in V$ do
        send $m_{i, j \bmod k(i)}$ to $j$
        if $j \in D_i$ then
            send "notify $k(i)$" to $j$
    for "notify $k(j)$" received from $j$ do
        $S := S \cup \{(j, k(j))\}$
    $l := 1$
    for $(j, k(j)) \in S$ do
        for $k \in \{1, \ldots, k(j)\}$ do
            send "request message from $j$" to $l$
            $l := l + 1$
    for "request message from $j$" received from node $j'$ do
        send $m_{j, i \bmod k(j)}$ to $j'$
    for received message $m$ do
        $R := R \cup \{m\}$
    return $R$
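
The three rounds of Round-Robin-Messaging can be traced with a small central simulation. This is a sketch under assumptions: nodes are numbered from 0, the function name and the example instance are made up for illustration, and the instance is chosen so that Conditions 1 to 3 of Lemma 4.1 hold (if Condition 2 were violated, the request counter would run past the last node and raise an `IndexError`).

```python
def round_robin_messaging(messages, recipients):
    """messages[i]: list of node i's messages; recipients[i]: set of destinations."""
    n = len(messages)
    # Round 1: node i parks message (j mod k(i)) at node j and notifies recipients.
    parked = [dict() for _ in range(n)]
    notices = [[] for _ in range(n)]
    for i in range(n):
        k = len(messages[i])
        if k == 0:
            continue
        for j in range(n):
            parked[j][i] = messages[i][j % k]
        for j in recipients[i]:
            notices[j].append((i, k))
    # Round 2: each node asks consecutive nodes, k(src) of them per source.
    requests = [[] for _ in range(n)]
    for i in range(n):
        l = 0
        for src, k in notices[i]:
            for _ in range(k):
                requests[l].append((i, src))
                l += 1  # stays below n as long as Condition 2 holds
    # Round 3: every node answers each request with the message it holds.
    received = [set() for _ in range(n)]
    for l in range(n):
        for requester, src in requests[l]:
            received[requester].add(parked[l][src])
    return received

sizes = [2, 3, 1, 0, 2, 3]
messages = [[(i, t) for t in range(k)] for i, k in enumerate(sizes)]
recipients = [{1, 2}, {0, 2}, {0, 1, 3, 4, 5}, set(), {3, 5}, {3}]
received = round_robin_messaging(messages, recipients)
```

In this instance node 3 is the destination of exactly $1 + 2 + 3 = 6 = n$ messages, so it sends one request to every node, which is the tight case of Condition 2.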

Corollary 4.2. Using Round-Robin-Messaging, the complete structure of the graph can be known
to all nodes in $O(\lceil |E|/n \rceil)$ rounds.

Algorithm TriNeighbors satisfies all the conditions of Lemma 4.1. We conclude that, employing
Round-Robin-Messaging, the round complexity of TriNeighbors becomes $O(\lceil \Delta^2/n \rceil)$. If $\Delta \in O(\sqrt{n})$,
then the round complexity is $O(1)$, and clearly optimal. More generally, any subgraph of diameter⁵
$D \in O(1)$ can be detected by each node exploring its $D$-hop neighborhood.

Corollary 4.3. We can test for subgraphs of diameter $D$ in $O(\lceil \Delta^{D+1}/n \rceil)$ rounds.

4.2 Bounded Arboricity

The arboricity $A$ of $G$ is defined to be the minimum number of forests on $V$ such that their
union is $G$. Note that always $A \le \Delta$, and for many graphs $A \ll \Delta$. The arboricity bounds the
number of edges in any subgraph of $G$ in terms of its nodes. We exploit this property to devise an
arboricity-based algorithm for triangle finding that we call TriArbor.

⁵ The diameter of the graph is the maximum shortest path length over all pairs of nodes.

4.2.1 An overview of the TriArbor algorithm



We wish to employ the same strategy used by the naive TriNeighbors, that is, "asking neighbors
for their neighbors", in a more careful manner, so as to avoid having high-degree nodes send their
entire neighbor list to many nodes. This is achieved by having all nodes with degree at most $4A$
send their neighbor list to their neighbors and then shut down. In the next iteration, the nodes that
have a degree at most $4A$ in the graph induced by the still active nodes do the same and shut down.
As $2An'$ uniformly bounds the sum of degrees of any subgraph of $G$ containing $n'$ nodes, in each
iteration at least half of the remaining nodes are shut down. Hence, the algorithm will terminate
within $O(\log n)$ iterations. In order to control the number of messages sent in each iteration, we
consider triangles involving at least one node of low degree (in the induced subgraph of the still
active nodes). As we will find a triangle once the degree of any of its nodes drops to at most $4A$,
all triangles will be detected.
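
The halving argument can be observed on a small instance. The sketch below is an illustration only: `peel` is a hypothetical name, the arboricity is passed in as a known value (as the non-uniform version of the algorithm assumes), and the example graph is the complete bipartite graph with 2 hubs and 50 leaves, whose arboricity is 2.

```python
def peel(adj, arboricity):
    """Repeatedly shut down all nodes of degree at most 4A among the active nodes.
    Returns one (active_count, shut_down_count) pair per iteration."""
    active = set(range(len(adj)))
    history = []
    while active:
        low = {v for v in active if len(adj[v] & active) <= 4 * arboricity}
        if not low:
            raise ValueError("arboricity bound too small for this graph")
        history.append((len(active), len(low)))
        active -= low
    return history

# Two hubs (nodes 0 and 1), each adjacent to all 50 leaves (nodes 2..51).
hubs, leaves = [0, 1], range(2, 52)
adj = [set(leaves), set(leaves)] + [set(hubs) for _ in leaves]
history = peel(adj, 2)
```

In the first iteration only the leaves qualify as low-degree; once they shut down, the hubs have no active neighbors left and are removed in the second iteration.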

Obviously, no node of low degree will have to send more than 4A messages in this scheme. 
However, it may be the case that a node receives more than 4A messages in case it has many 
low-degree neighbors. To remedy that, low-degree nodes avoid sending their neighbor list to their 
high-degree neighbors directly, and instead send them to intermediate nodes we call delegates. The 
delegates share the load of testing their associated high-degree node's neighborhood for triangles 
involving a low-degree node. 

Note that in the presented form, the algorithm is not uniform, i.e., it is assumed that A is known. 
We will later discuss how to remove this assumption and slightly improve its round complexity at 
the same time. 

4.2.2 TriArbor algorithm 
Choosing delegates 

In each iteration, every delegate node will be assigned to a unique high-degree node, i.e., a node of
degree larger than $4A$ in the subgraph induced by the nodes that are still active. In the following,
we will discuss a single iteration of the algorithm. Denote by $G' := (V', E')$ some subgraph of $G$
on $n'$ nodes, where WLOG $V' = \{1, \ldots, n'\}$. Define $\delta'_i$, $\Delta'$, $\mathcal{N}'_i$, etc. analogously to the respective
values without a prime, but with respect to $G'$ instead of $G$. We would like to assign to each
high-degree node $i$ exactly $\lceil \delta'_i / (4A) \rceil$ delegates such that each delegate is responsible for up to $4A$ of
the respective high-degree node's neighbors.

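Claim 4.4. At least $n'/2$ of the nodes have degree at most $4A$ and the number of assigned delegates
is bounded by $n'$.

Proof. We have that

$$\sum_{i \in V'} \delta'_i = 2|E'| < 2An',$$

so fewer than $n'/2$ nodes can have degree larger than $4A$. Therefore,

$$\sum_{i:\, \delta'_i > 4A} \left\lceil \frac{\delta'_i}{4A} \right\rceil < \sum_{i:\, \delta'_i > 4A} \left( \frac{\delta'_i}{4A} + 1 \right) < \frac{n'}{2} + \frac{n'}{2} = n',$$

i.e., less than $n'$ delegates are required.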



□

Algorithm 5 One iteration of TriArbor at node $i$.

    // compute delegates
    send $\delta'_i$ to all other nodes
    compute assignment of delegates to high-degree nodes and neighbor sublists
    // high-degree nodes distribute their neighborhood
    if $\delta'_i > 4A$ then
        partition $\mathcal{N}'_i$ into $\lceil \delta'_i / (4A) \rceil$ lists of length at most $4A$
        send each sublist to the computed delegate
        for $j \in \mathcal{N}'_i$ do
            notify $j$ of the delegate assigned to it // only $i$ knows the order of $\mathcal{N}'_i$, hence communication required
    // let all delegates learn about $\mathcal{N}'_j$
    if $i$ is delegate of some node $j$ then
        denote by $D_j$ the set of delegates of $j$
        denote by $L_{j,i} \subseteq \mathcal{N}'_j$ the sublist of neighbors received from $j$
        for $k \in D_j$ do
            send $L_{j,i}$ to $k$
        for received sublist $L_{j,k}$ do
            $\mathcal{N}'_j := \mathcal{N}'_j \cup L_{j,k}$
    // low-degree nodes distribute their neighborhoods
    if $\delta'_i \le 4A$ then
        for $j \in \mathcal{N}'_i$ do
            if $\delta'_j \le 4A$ then
                send $\mathcal{N}'_i$ to $j$ // low-degree nodes can handle load themselves
            else
                send $\mathcal{N}'_i$ to the delegate of $j$ assigned to $i$
    // check for triangles
    for received $\mathcal{N}'_j$ (from $j$ with $\delta'_j \le 4A$) do
        if $\mathcal{N}'_i \cap \mathcal{N}'_j \ne \emptyset$ then
            send "triangle found" to all nodes // detected triangle involving two low-degree nodes
        else if $i$ is delegate of $k$ and $\mathcal{N}'_j \cap \mathcal{N}'_k \ne \emptyset$ then
            send "triangle found" to all nodes // detected triangle involving one low-degree node
    if received "triangle found" then
        return true
    else
        return false



Moreover, the assignment of delegates to high-degree nodes can be computed locally using a
predetermined function of the degrees $\delta'_i$. Thus, if every node communicates its degree, all nodes
can determine locally the assignment of delegates to high-degree nodes in a consistent manner.
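To illustrate how such a predetermined function might look, here is a minimal Python sketch. The concrete rule (scan nodes by increasing identifier and hand out delegate identifiers in the same order) and the names `assign_delegates`, `degrees`, and `arboricity` are assumptions made for illustration; the text only requires that all nodes apply the same deterministic rule to the announced degrees.

```python
def assign_delegates(degrees, arboricity):
    """Return {high-degree node: [delegate ids]} for one iteration.

    `degrees` maps each active node id to its current degree. The rule used
    here is a hypothetical choice; any fixed rule shared by all nodes works.
    """
    threshold = 4 * arboricity
    order = sorted(degrees)
    assignment = {}
    next_free = 0  # index of the next unused delegate in `order`
    for node in order:
        if degrees[node] > threshold:
            need = -(-degrees[node] // threshold)  # ceil(degree / (4A))
            if next_free + need > len(order):
                raise ValueError("not enough delegates; is A the arboricity?")
            assignment[node] = order[next_free:next_free + need]
            next_free += need
    return assignment


# Star on 17 nodes (arboricity 1): the center has degree 16, leaves degree 1.
star_degrees = {0: 16, **{leaf: 1 for leaf in range(1, 17)}}
star_assignment = assign_delegates(star_degrees, arboricity=1)
```

Because the function depends only on the announced degrees, every node that evaluates it obtains the same assignment without further communication.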

The algorithm 

Algorithm 5 shows the pseudocode of one iteration of TriArbor. The complete algorithm iterates
until $\delta'_i = 0$ for all nodes $i$ and outputs "true" if a triangle was detected in one of the iterations
and "false" otherwise.

Claim 4.5. TriArbor terminates within $\lceil \log n \rceil$ iterations.

Proof. Follows directly from Claim 4.4, as in each iteration at least half of the nodes are eliminated. 

□ 
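The halving argument can be checked on small inputs with the following sketch, which centrally simulates the elimination of low-degree nodes and counts iterations. The helper names `build_adjacency` and `peel_iterations` are hypothetical, and the simulation ignores all communication; it only tracks which nodes stay active.

```python
def build_adjacency(n, edges):
    """Adjacency sets for an undirected graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def peel_iterations(adj, arboricity):
    """Count iterations until every node has been eliminated."""
    active = set(adj)
    iterations = 0
    while active:
        low = {v for v in active if len(adj[v] & active) <= 4 * arboricity}
        if not low:
            # By Claim 4.4 this cannot happen if `arboricity` is correct.
            raise ValueError("no low-degree node; arboricity underestimated")
        active -= low
        iterations += 1
    return iterations


star = build_adjacency(17, [(0, leaf) for leaf in range(1, 17)])
path = build_adjacency(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
```

On the star, the leaves are eliminated first and the center follows in the second iteration, well within the bound of $\lceil \log 17 \rceil = 5$.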

Lemma 4.6. TriArbor correctly decides whether the graph contains a triangle or not. 

Proof. Clearly, there are no false positives, as in each iteration, nodes will only claim that a triangle 
is found if they learned about an edge connecting two nodes in the same neighborhood (either their 
own or the node whose delegate they are). 

Recall that by Claim 4.4, there are sufficiently many delegates available, and we observed that 
the assignment can be computed as the same function of the (current) degrees. 

Now, assume the graph contains some triangle $\{i_1, i_2, i_3\}$. There must be some iteration in
which one of the nodes, say $i_1$, has degree $\delta'_{i_1} \le 4A$ and the triangle is still in the subgraph induced
by active nodes: By Claim 4.5, eventually all nodes get eliminated, while each edge connecting
two high-degree nodes will still be present in the subgraph induced by the active nodes of the next
iteration.

We distinguish two cases. If in the respective iteration it also holds that $\delta'_{i_2} \le 4A$, then $i_1$ will
send $i_2$ its neighbor list (with respect to the induced subgraph), and $i_2$ will detect the triangle.
Otherwise, $i_1$ will send its current neighbor list to one of $i_2$'s delegates. As $i_2$ splits its neighbor
list and distributes it among its delegates, which share their sublist with all other delegates, this
delegate will detect the triangle. Hence, in both cases, the triangle will eventually be discovered,
this information will be spread among the nodes, and all nodes will compute the correct output. □
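The case distinction above can be mirrored by a sequential reference check. The sketch below is a centralized stand-in, not the distributed algorithm: it assumes that whoever receives $\mathcal{N}'_i$ from a low-degree node $i$ (the neighbor $j$ itself or one of its delegates) knows all of $\mathcal{N}'_j$, and it simply tests the intersection. The function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(n, edges):
    """Adjacency sets for an undirected graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def triarbor_reference(adj, arboricity):
    """Centralized emulation of the triangle test made in each iteration."""
    active = set(adj)
    while active:
        nbrs = {v: adj[v] & active for v in active}
        low = {v for v in active if len(nbrs[v]) <= 4 * arboricity}
        if not low:
            raise ValueError("no low-degree node; arboricity underestimated")
        for i in low:
            for j in nbrs[i]:
                # j (or a delegate of j) compares the received list with N'_j.
                if nbrs[i] & nbrs[j]:
                    return True
        active -= low
    return False


def brute_force(adj):
    """Check all vertex triples directly."""
    return any(b in adj[a] and c in adj[a] and c in adj[b]
               for a, b, c in combinations(sorted(adj), 3))


triangle = build_adjacency(3, [(0, 1), (1, 2), (0, 2)])
cycle4 = build_adjacency(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
hub = build_adjacency(20, [(0, v) for v in range(1, 20)] + [(1, 2)])
```

In the `hub` example the center is a high-degree node for $A = 2$, so the triangle $\{0, 1, 2\}$ is found through the second case of the proof.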



Round complexity of TriArbor 

We examine the time complexity of one iteration of the algorithm. Obviously, announcing degrees 
takes a single round only. 

Claim 4.7. The distribution of high-degree nodes' neighborhoods can be performed in two rounds.

Proof. Every node $i$ with $\delta'_i > 4A$ partitions its neighbor list and sends it, totalling at most
$\delta'_i < n'$ messages. As each node is delegate of at most one node, no more than $4A$ messages need
to be received. Observe that since all nodes are aware of the assignment of delegates as well as all
node degrees, we can apply Lemma 3.3 to see that all messages can be delivered in two rounds.
Notifying neighbors of their assigned delegates takes one message for each neighbor. However, both
tasks are independent, therefore we can merge the respective messages, resulting in a total of two
rounds. □

Claim 4.8. Exchanging neighborhood sublists between delegates can be implemented in four rounds.

Proof. Every delegate holds a sublist of at most $4A$ of the neighbors of the node $i$ it has been
assigned to. Hence, it needs to send at most $\lceil \delta'_i/(4A) \rceil \cdot 4A < 2\delta'_i < 2n'$ messages. Similarly, it receives
less than $2n'$ messages. As delegates are aware of the number of messages to exchange, Lemma 3.3
shows that we can implement this communication in four rounds. □

Claim 4.9. The distribution of low-degree nodes' neighborhoods can be performed in $3\lceil 32A^2/n \rceil$
rounds.

Proof. Every node $i$ with $\delta'_i \le 4A$ sends $\delta'_i$ messages to each of its low-degree neighbors and to
one delegate of each high-degree neighbor, i.e., at most $16A^2$ messages. Similarly, both low-degree
nodes and delegates receive at most $16A^2$ messages. As the low-degree nodes send their entire
neighborhood to all destinations, applying Lemma 4.1 repeatedly yields that this communication
can be performed in $3\lceil 32A^2/n \rceil$ rounds (note that nodes may have to receive $32A^2$ messages
because they may have low degree and be delegate at the same time). □

Algorithm 6 QuickDecomposition at node $i$.

    $V' := V$
    for $\lceil \log n \rceil$ iterations do
        send $\delta'_i := |\mathcal{N}_i \cap V'|$ to all nodes
        $V' := V' \setminus \{j \in V' \mid \delta'_j \le 4A\}$

Finally, announcing a found triangle takes one more round. All in all, we get the following 
result. 

Theorem 4.10. Algorithm TriArbor is correct. Using our Deterministic Message Passing and
Round-Robin-Messaging algorithms, it can be implemented with a running time of $O(\lceil A^2/n \rceil \log n)$
rounds.

Proof. Correctness was shown in Lemma 4.6. Combining Claims 4.7, 4.8, and 4.9, we see that a
single iteration of the algorithm can be implemented with running time $O(\lceil A^2/n \rceil)$. By Claim 4.5,
the total running time is thus bounded by $O(\lceil A^2/n \rceil \log n)$ rounds. □

Corollary 4.11. The iterations of TriArbor can be parallelized, reducing the round complexity to
$O(A^2/n + \log n)$.

Proof. We first let all nodes execute a short announcement phase, whose pseudocode is given
in Algorithm 6, storing all received values.

The aim of this "announcement phase" is that for all iterations, the nodes will know in advance 
which nodes are of high degree, which are of low degree, and which nodes are the delegates of which 
other nodes. As all this information can be inferred from the degree distributions at the beginning 
of each iteration, which by itself is also a function of the degrees in the previous iteration, the above 
routine performs this task. 

Our goal is now to show that we can "merge" the further communication of all iterations such
that the total running time is bounded by $O(\lceil A^2/n \rceil)$. Note that nodes may take on up to three roles
during the execution of the algorithm: they may act as (i) high-degree nodes, (ii) delegates, and (iii)
low-degree nodes. However, according to Claim 4.4, during the entire execution of the algorithm,
the total number of delegates is bounded by

$$\sum_{i=1}^{\infty} \frac{n}{2^{i-1}} = 2n.$$

We conclude that we can assign delegates in a way such that each node acts as delegate in at most
two iterations. Furthermore, each node is a low-degree node in exactly one iteration, as afterwards
it is eliminated from the subgraph induced by active nodes. Therefore, the asymptotic bounds from
Claims 4.8 and 4.9 can be shown analogously also for the merged execution. Regarding
Claim 4.7, observe that since the number of active nodes decreases exponentially, no node sends more
than $2n$ messages in its role as high-degree node during the course of the algorithm. Overall, we
obtain the same asymptotic running time bound of $O(\lceil A^2/n \rceil)$ for the communication performed by
all iterations of the algorithm as we did before for a single one. Adding the initial $O(\log n)$ rounds
for determining the active nodes in each iteration, the claimed running time bound follows. □

Furthermore, we can utilize the "excess capacity" of the communication system in case $A^2 \ll n$
to further reduce the number of iterations.

Corollary 4.12. TriArbor can be modified to run in $O(A^2/n + \log_{2+n/A^2} n)$ rounds.

Proof. Instead of choosing the threshold for low-degree nodes to be $4A$, we pick $\max\{4A, \lceil \sqrt{n} \rceil\}$.
If $4A \ge \lceil \sqrt{n} \rceil$ the algorithm behaves as before. Otherwise, we have that in each iteration at most

$$\frac{1}{\lceil \sqrt{n} \rceil} \sum_{i \in V'} \delta'_i < \frac{2An'}{\sqrt{n}}$$

nodes remain active, implying that all nodes are eliminated in $O(\log_{2+n/A^2} n)$ rounds.

It remains to show that if $\lceil \sqrt{n} \rceil > 4A$, all iterations can be executed in parallel in $O(1)$ rounds.
Observe that Claims 4.7 and 4.8 hold for any choice of the threshold. Hence, as the number of nodes
decreases exponentially also if $\lceil \sqrt{n} \rceil > 4A$, the distribution of high-degree nodes' neighborhoods
and the communication among delegates can be performed in $O(1)$ rounds in total. Regarding the
messages sent by low-degree nodes, in total less than $\lceil \sqrt{n} \rceil^2 < 2n$ (instead of $16A^2$) messages need
to be conveyed, and each delegate receives at most $\lceil \sqrt{n} \rceil^2 < 2n$ messages. As each node is delegate
at most twice, this requires $O(1)$ rounds as well. Hence, taking into account Theorem 4.10 and
Corollary 4.11, the statement follows. □
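The iteration count for the case $\lceil \sqrt{n} \rceil > 4A$ can be spelled out as follows. This is a worked sketch under the assumption that the fraction of surviving nodes per iteration is at most $2A/\sqrt{n}$, as bounded in the proof; $n'_k$ (the number of active nodes after $k$ iterations) is notation introduced here for illustration.

```latex
\begin{align*}
  n'_k &< n \left(\frac{2A}{\sqrt{n}}\right)^{k}, \\
  n'_k < 1 \quad &\text{as soon as} \quad
  k \ge \frac{\ln n}{\ln\bigl(\sqrt{n}/(2A)\bigr)}
    = \frac{2 \ln n}{\ln\bigl(n/(4A^2)\bigr)}
    = 2 \log_{n/(4A^2)} n .
\end{align*}
```

Since $\sqrt{n} > 4A$ gives $n/A^2 > 16$, we have $\ln(n/(4A^2)) \ge \tfrac{1}{2}\ln(n/A^2)$, so the number of iterations is in $O(\log_{2+n/A^2} n)$.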

It remains to remove the dependence of the algorithm on knowledge on A. 

Corollary 4.13. A variant of TriArbor can be executed successfully in $O(A^2/n + \log_{2+n/A^2} n)$
rounds with no prior knowledge of $A$.

Proof. Denote by $\bar{\delta}' := (\sum_{i \in V'} \delta'_i)/n'$ the average degree of the graph of currently active nodes $G'$.
Instead of setting the threshold for high-degree nodes to $\max\{4A, \lceil \sqrt{n} \rceil\}$ as in Corollary 4.12, we
pick $\max\{2\bar{\delta}', \lceil \sqrt{n} \rceil\}$. We have that

$$\frac{1}{2\bar{\delta}'} \sum_{i \in V'} \delta'_i = \frac{n'}{2},$$

i.e., still at least half of the active nodes are eliminated in each iteration. Moreover,

$$\bar{\delta}' = \sum_{i \in V'} \frac{\delta'_i}{n'} < 2A,$$

so the threshold $2\bar{\delta}'$ never exceeds $4A$; hence, arguing analogously to Corollary 4.11, we can
perform all iterations together in $O(\lceil A^2/n \rceil)$ rounds. □

Remark 4.14. In [4] it is shown that, for any graph, $A \in O(\sqrt{|E| + n})$. Plugging this bound
into the running time guaranteed by Corollary 4.13 yields the observation that the round complexity
achieved by TriArbor is always in $O(|E|/n + \log n)$; that is, up to an additive logarithm, the best we
have shown so far (recall that Claim 4.4 yields $O(|E|/n)$).

5 Randomization

The communication pattern of our randomized algorithm is not as well-structured as that of the
presented deterministic solutions, hence it is difficult to efficiently organize the exchange of
information by means of a deterministic subroutine. Therefore, we make use of a randomized routine
from [10].

Theorem 5.1 ([10]). Given a bulk of messages such that: 

1. No node is the source of more than n messages. 

2. No node is the destination of more than n messages. 

3. Each source knows the content of its messages. 

For any predefined constant $c > 0$, all messages can be delivered in $O(1)$ rounds with high probability
(w.h.p.), i.e., with probability at least $1 - 1/n^c$.

We try to give some intuition on why this theorem is true. A key idea is that, using
randomization, it is possible to first distribute a fairly large fraction of the messages in a roughly
balanced manner, i.e., such that $n - o(n)$ messages for each destination can be delivered by each node
sending at most $O(1)$ messages to the respective destination. Subsequently, we can make "more effort" to
distribute the remaining $o(n^2)$ messages evenly. To this end, these messages are duplicated and
sent redundantly to different randomly chosen relay nodes. The number of copies is limited in order
not to overload the network. This results in an exponentially amplified probability to succeed in
delivering each message. Hence, after one iteration of this scheme, we will have far fewer messages
to deliver, which allows us to increase the number of redundant copies used for each message further, and
so on. Repeated application results in delivery of all messages in $O(1)$ rounds.

5.1 The algorithm 

When sampling randomly for triangles, we would like to use the available information as efficiently
as possible. To this end, in the first iteration of Algorithm 7 all nodes sample randomly chosen
induced subgraphs of a certain size and examine them for triangles. In subsequent iterations
the size of the checked subgraphs is increased. Checking a subgraph of size $s$ requires learning
about $O(s^2)$ edges, while it tests for $O(s^3)$ potential triangles. If $s \in \Theta(\sqrt{n})$, it thus takes a
linear number of messages to collect the induced subgraph at some node and test for triangles.
Using the subroutine from Theorem 5.1, each node can sample such a graph in parallel in $O(1)$
rounds. Intuitively, this means sampling $O(n^{5/2})$ subsets of three vertices in constant time. As
$|\binom{V}{3}| \in O(n^3)$, one therefore can expect to find a triangle quickly if at least $\Omega(\sqrt{n})$ triangles are
present in $G$. If fewer triangles are in the graph, we need to sample more. In order to do this
efficiently, it makes sense to increase $s$ instead of just reiterating the routine with the same set size:
The time complexity grows quadratically, whereas the number of sampled 3-vertex-subsets grows
cubically. Finally, once the running time of an iteration hits $n^{1/3}$, we will switch to deterministic
searching to guarantee termination within $O(n^{1/3})$ rounds. Interestingly, the set size of $s = n^{2/3}$
corresponding to this running time ensures that even a single triangle in the graph is found with
constant probability.

5.2 Round complexity 

Our first observation is that the last iteration dominates the round complexity of the algorithm.

Algorithm 7 TriSample at node $i$.

    $s := \sqrt{n}$
    while $s < n^{2/3}$ do
        choose a uniformly random subset $C_i$ of $s$ nodes
        for $j \in C_i$ do
            send the member list of $C_i$ to $j$
        for received member list $C_j$ from $j$ do
            send $\mathcal{N}_i \cap C_j$ to $j$
        $E_i := \emptyset$
        for received $\mathcal{N}_j \cap C_i$ from $j$ do
            for $k \in \mathcal{N}_j \cap C_i$ do
                $E_i := E_i \cup \{\{j, k\}\}$
        if $G_i := (V, E_i)$ contains a triangle then
            send "triangle found" to all nodes
        if received "triangle found" then
            return true
        else
            $s := 2s$
    run TriPartition and return its output // switch to deterministic strategy
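The control flow of TriSample can be illustrated with a sequential Python sketch. It is a centralized simulation under simplifying assumptions: message exchange is replaced by direct access to the adjacency sets, and an exhaustive check (`has_triangle_exhaustive`, a hypothetical helper) stands in for TriPartition, which is defined elsewhere in the paper.

```python
import math
import random


def has_triangle_exhaustive(adj):
    """Stand-in for TriPartition: some edge has a common neighbor."""
    return any(adj[a] & adj[b] for a in adj for b in adj[a])


def tri_sample(adj, rng):
    """Sequentially simulate the sampling loop of TriSample."""
    nodes = sorted(adj)
    n = len(nodes)
    s = math.ceil(math.sqrt(n))
    while s < n ** (2 / 3):
        for _ in nodes:  # every node i draws its own subset C_i
            members = set(rng.sample(nodes, s))
            # Node i learns the subgraph induced by C_i and inspects it.
            for j in members:
                for k in adj[j] & members:
                    if adj[j] & adj[k] & members:
                        return True
        s *= 2  # no triangle seen in this iteration: double the sample size
    return has_triangle_exhaustive(adj)  # switch to deterministic strategy


def complete_graph(n):
    return {v: set(range(n)) - {v} for v in range(n)}


def complete_bipartite(half):
    left, right = set(range(half)), set(range(half, 2 * half))
    return {v: set(right if v in left else left) for v in left | right}


k30 = complete_graph(30)
k15_15 = complete_bipartite(15)
single = {v: set() for v in range(40)}
for u, v in ((0, 1), (1, 2), (0, 2)):
    single[u].add(v)
    single[v].add(u)
```

Because the loop only reports triangles it has actually seen and the fallback is exhaustive, the sketch returns the correct answer regardless of the random choices; the randomness affects only how early it stops.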



Lemma 5.2. If TriSample terminates after $m$ iterations, the round complexity is in $O(2^{2m})$ with
high probability.

Proof. Let $s_k$ denote $s$ in the $k$th iteration, hence $s_1 = \sqrt{n}, s_2 = 2\sqrt{n}, \ldots, s_m = 2^{m-1}\sqrt{n}$. Clearly,
in the $k$th iteration, every node $i$ sends out exactly $s_k^2$ messages to nodes $j$ informing them about
$C_i$. Since the sets $C_i$ are chosen independently, by Chernoff's bound with high probability every
node $j$ in the $k$th iteration is a member of $O(s_k)$ sets $C_i$, and therefore receives $O(s_k)$ such subsets.
It follows that, w.h.p., it will respond with in total at most $O(s_k^2)$ messages telling the respective
nodes $i$ about $\mathcal{N}_j \cap C_i$. The recipients of these messages will have to bear a load of at most $|C_i|^2 =
s_k^2$. By Theorem 5.1, these message exchanges may be accomplished in $O(\lceil s_k^2/n \rceil)$ rounds w.h.p.
If the algorithm terminates after the $m$th iteration, the overall round complexity is therefore in

$$O\left(\sum_{k=1}^{m} \lceil s_k^2/n \rceil\right) = O(\lceil s_m^2/n \rceil).$$ □
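The last equality, and the link to the $O(2^{2m})$ bound in the lemma, follow from a geometric sum. A short worked version, using only $s_k = 2^{k-1}\sqrt{n}$ from the proof:

```latex
\begin{align*}
  \sum_{k=1}^{m} \left\lceil \frac{s_k^2}{n} \right\rceil
    = \sum_{k=1}^{m} 4^{k-1}
    = \frac{4^{m} - 1}{3}
    < \frac{4}{3} \cdot 4^{m-1}
    = \frac{4}{3} \cdot \frac{s_m^2}{n}
    < 2^{2m}.
\end{align*}
```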

Corollary 5.3. If TriSample is guaranteed to find a triangle with probability $1 - \varepsilon/2$ once $s$
passes some threshold $s(\varepsilon)$, then the round complexity to find a triangle with probability $1 - \varepsilon$
is $O(\lceil s(\varepsilon)^2/n \rceil)$ w.h.p.

Proof. By Lemma 5.2 and the union bound. □

Remark 5.4. Note that $s_m \le n^{2/3}$ by the loop condition and afterwards the algorithm simply
executes TriPartition. The round complexity is therefore in $O(n^{1/3})$ with high probability.

5.2.1 Proof overview 

Our aim is to bound the number of iterations needed to detect a triangle with probability at least
$1 - \varepsilon$, as a function of the number of triangles in the graph. Let $T \subseteq \binom{V}{3}$ denote the set of triangles
in $G$, where $|T| = t$.

On an intuitive level, the triangles are either scattered (i.e., rarely share edges) or clustered.
If the triangles are scattered, then applying the inclusion-exclusion principle of the second order
will give us a sufficiently strong bound on the probability of success. If the triangles are clustered,
then there exists an edge that many of them share. Finding that specific edge is more likely than
finding any specific triangle, and given this edge is found, the probability to find at least one of the
triangles it participates in is large.



5.2.2 Bounding the probability of success using the inclusion-exclusion principle 

We know by the inclusion-exclusion principle that

$$\Pr[\text{a triangle is found}] \ge t \cdot \Pr[\text{a fixed triangle is found}] - \sum_{a \ne b \in T} \Pr[\text{at least } a \text{ and } b \text{ are found}].$$

For every $a \ne b \in T$ there are three cases to consider:

1. $a$ and $b$ are disjoint, that is, $a \cap b = \emptyset$.

2. $a$ and $b$ share a single vertex, $|a \cap b| = 1$.

3. $a$ and $b$ share an edge, $|a \cap b| = 2$.

Observe that for every constant $r$ and set of vertices $V_0$ s.t. $|V_0| = r$, it holds that:

$$\left(\frac{s_m}{n - s_m + r}\right)^r \le \Pr[V_0 \text{ is chosen in the } m\text{th iteration}] \le \left(\frac{s_m}{n - s_m}\right)^r. \quad (1)$$

Definition 5.5. $T_r \subseteq \binom{T}{2}$ is the set of pairs of distinct triangles in $G$ that have together exactly $r$
vertices. Denoting $t_r = |T_r|$, clearly $t_4 + t_5 + t_6 = \binom{t}{2} = |\binom{T}{2}|$.

Define

$$P_m = \Pr[\text{triangle found in iteration } m]$$

and

$$p_m = \Pr[\text{node } i \text{ found triangle in iteration } m].$$

For symmetry reasons the latter probability is the same for each node $i$.
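Claim 5.6. For $0 < \varepsilon < 2$, if $p_m \ge \ln(2/\varepsilon)/n$ then $P_m \ge 1 - \varepsilon/2$.

Proof. Recall that each node $i$ chooses $C_i$ independently. Consequently, the probability of no
triangle being found in the $m$th iteration is at most

$$\left(1 - \frac{\ln(2/\varepsilon)}{n}\right)^{n} = \left(\left(1 - \frac{\ln(2/\varepsilon)}{n}\right)^{n/\ln(2/\varepsilon)}\right)^{\ln(2/\varepsilon)} < e^{-\ln(2/\varepsilon)} = \frac{\varepsilon}{2}.$$

□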
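The bound in this proof is easy to sanity-check numerically. The following sketch evaluates the left-hand side for a few parameter choices; the function name `miss_probability_bound` is hypothetical, and the check assumes $\ln(2/\varepsilon) \le n$ so that the per-node success probability is at most 1.

```python
import math


def miss_probability_bound(n, eps):
    """(1 - ln(2/eps)/n)^n: all n nodes fail when each succeeds w.p. ln(2/eps)/n."""
    p = math.log(2 / eps) / n
    if not 0 <= p <= 1:
        raise ValueError("need ln(2/eps) <= n for a valid probability")
    return (1 - p) ** n


# Pairs (n, eps, bound, eps/2) for a few illustrative parameter choices.
checks = [(n, eps, miss_probability_bound(n, eps), eps / 2)
          for n in (10, 1000, 10**6)
          for eps in (0.01, 0.1, 1.0)]
```

In every case the computed value stays below $\varepsilon/2$, as the inequality $(1 - x)^n < e^{-xn}$ for $0 < x \le 1$ predicts.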

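With the above notations, we combine Inequality (1) with the inclusion-exclusion principle to
obtain:

$$p_m \ge t \left(\frac{s_m}{n - s_m + 3}\right)^{3} - \sum_{r=4}^{6} t_r \left(\frac{s_m}{n - s_m}\right)^{r}. \quad (2)$$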

Recall that we distinguish between the cases of "scattered" and "clustered" triangles. We now
give these expressions a formal meaning by defining a threshold for $t_4$ in terms of $t$ and a critical
value $s(\varepsilon)$ of $s_m$, namely $s(\varepsilon) := \max\{2n^{2/3} t^{-1/3} \ln^{1/3}(2/\varepsilon), 2\sqrt{n \ln(2/\varepsilon)}\}$. The critical value stems
from either of the following cases:

1. Scattered triangles - we wish to sample as many triangles as possible, and the number of
triangles sampled grows cubically in $s_m$. The $n^{2/3}$ factor in the numerator reflects the fact
that $s_m = n^{2/3}$ would imply that each triangle is sampled with constant probability.⁶ Clearly,
having a lot of triangles in general improves the probability of success, hence the division by
$t^{1/3}$.

2. Clustered triangles - it may be the case that all triangles share a single edge, hence we must
sample this edge with probability at least $1 - \varepsilon/2$. For $s_m = \sqrt{n}$ each node samples $O(n)$
edges, hence each edge is sampled with constant probability.
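To see which of the two terms dominates for given parameters, the critical value can be evaluated directly. The sketch below is a plain transcription of the definition of $s(\varepsilon)$; the function name `critical_size` and the example parameters are assumptions for illustration.

```python
import math


def critical_size(n, t, eps):
    """s(eps) = max{2 n^(2/3) t^(-1/3) ln^(1/3)(2/eps), 2 sqrt(n ln(2/eps))}."""
    log_term = math.log(2 / eps)
    scattered = 2 * n ** (2 / 3) * (log_term / t) ** (1 / 3)
    clustered = 2 * math.sqrt(n * log_term)
    return max(scattered, clustered)


n = 10**6
few_triangles = critical_size(n, t=1, eps=0.5)       # scattered term dominates
many_triangles = critical_size(n, t=10**6, eps=0.5)  # clustered term dominates
clustered_only = 2 * math.sqrt(n * math.log(4))
```

With a single triangle the first term is the maximum, whereas for $t = n$ the value is pinned at $2\sqrt{n \ln(2/\varepsilon)}$: no matter how many triangles exist, they might all share one edge.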

5.2.3 Scattered triangles 

Assume $t_4 < tn/(2s(\varepsilon))$.

Lemma 5.7. If $t_4 < tn/(2s(\varepsilon))$ and $n$ is sufficiently large, then a triangle will be found with
probability at least $1 - \varepsilon/2$ in any iteration where $s_m \ge s(\varepsilon)$.

Proof. We rewrite Inequality (2) as

$$p_m \ge \frac{(1 - o(1))\, t s_m^3 (n - s_m)^3 - t_4 s_m^4 (n - s_m)^2 - t_5 s_m^5 (n - s_m) - t_6 s_m^6}{(n - s_m)^6}.$$

Due to the loop condition in TriSample, $s_m \le n^{2/3}$ and thus $n - s_m \in (1 - o(1))n$, therefore

$$p_m \ge \frac{(1 - o(1))\, t s_m^3 n^3 - t_4 s_m^4 n^2 - t_5 s_m^5 n - t_6 s_m^6}{n^6} = \frac{s_m^3 \left((1 - o(1))\, t n^3 - t_4 s_m n^2 - t_5 s_m^2 n - t_6 s_m^3\right)}{n^6}.$$

As $t_4 < tn/(2s(\varepsilon)) \le tn/(2s_m)$, this can be estimated further by

$$p_m \ge \frac{s_m^3 \left((\tfrac{1}{2} - o(1))\, t n^3 - t_5 s_m^2 n - t_6 s_m^3\right)}{n^6}.$$

By definition, $t_4 + t_5 + t_6 = \binom{t}{2} < t^2/2$, therefore there exist $\beta, \gamma \ge 0$ such that $t_5 = \beta t^2$, $t_6 = \gamma t^2$
and $\beta + \gamma < 1/2$. Using this notation,

$$p_m \ge \frac{s_m^3 \left((\tfrac{1}{2} - o(1))\, t n^3 - \beta t^2 s_m^2 n - \gamma t^2 s_m^3\right)}{n^6} = \frac{s_m^3 t \left((\tfrac{1}{2} - o(1))\, n^3 - \beta t s_m^2 n - \gamma t s_m^3\right)}{n^6}.$$

By the loop condition in TriSample, $s_m \le n^{2/3}$. Recalling that we assume $t \in o(n^{2/3})$, this
becomes

$$p_m \ge \frac{s_m^3 t \left((\tfrac{1}{2} - o(1))\, n^3 - \beta t n^{7/3} - \gamma t n^2\right)}{n^6} \ge \frac{s_m^3 t\, (\tfrac{1}{2} - o(1))\, n^3}{n^6}.$$

Given that $s_m \ge s(\varepsilon) \ge 2n^{2/3} \ln^{1/3}(2/\varepsilon)/t^{1/3}$, we have for sufficiently large $n$ that

$$\frac{s_m^3 t\, (\tfrac{1}{2} - o(1))\, n^3}{n^6} \ge \frac{(\tfrac{1}{2} - o(1)) \cdot 8 t n^2 \ln(2/\varepsilon)}{t n^3} \ge \frac{2 \ln(2/\varepsilon)}{n}.$$

By Claim 5.6 this implies that the probability of finding a triangle in iteration $m$ is at least
$1 - \varepsilon/2$. □



⁶ Observe that TriPartition samples exactly $n^{2/3}$ vertices per node in a way covering all subsets of 3 nodes.

5.2.4 Clustered triangles

Assume $t_4 \ge t \cdot n/(2 \cdot s_m)$. The strategy employed here is to show that due to the bound on $t_4$, there
exists an edge shared by many triangles. Subsequently the analysis focuses on this edge, showing
that the probability to sample this edge and find a triangle containing it is sufficiently large.

Definition 5.8. For each edge $e \in E$, define $\Delta_e = |\{T_i : e \subset T_i\}|$. In other words, $\Delta_e$ is the number
of triangles that $e$ participates in. Denote $\Delta_{\max} = \max_{e \in E} \Delta_e$.

Lemma 5.9. $\Delta_{\max} \ge 2t_4/(3t)$.

Proof. Consider a figure consisting of two triangles sharing an edge (this is basically $K_4$ with one
edge removed). We count the occurrences of this figure in $G$ in two different ways:

1. Observe that $t_4$ counts just that.

2. Pick one of the $t$ triangles from $T$, choose one of its 3 edges, denote it $e$. Choose one of the
other $\Delta_e - 1$ triangles that share $e$ to complete the figure. Note that this counts every figure
exactly twice, since we may pick either of the two triangles in the figure to be the first one.
By definition $\Delta_e - 1 \le \Delta_{\max} - 1$, hence we count at most $3t(\Delta_{\max} - 1)/2$ occurrences.

By comparing 1. and 2. we conclude that $t_4 \le 3t(\Delta_{\max} - 1)/2$, completing the proof. □

Remark 5.10. The tightness of this bound can be confirmed by examining a complete graph. 
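The double-counting argument and its tightness on complete graphs can be verified by brute force on small instances. The sketch below computes $t$, $t_4$, and the maximum number of triangles per edge; the function name `triangle_stats` is hypothetical. It uses the fact that two distinct triangles share at most one edge, so $t_4$ is the sum over edges of the number of triangle pairs containing that edge.

```python
from itertools import combinations


def triangle_stats(adj):
    """Return (t, t4, max triangles per edge) for an undirected graph."""
    nodes = sorted(adj)
    triangles = [c for c in combinations(nodes, 3)
                 if all(v in adj[u] for u, v in combinations(c, 2))]
    per_edge = {}
    for tri in triangles:
        for edge in combinations(tri, 2):
            per_edge[edge] = per_edge.get(edge, 0) + 1
    t = len(triangles)
    t4 = sum(c * (c - 1) // 2 for c in per_edge.values())
    most = max(per_edge.values(), default=0)
    return t, t4, most


def complete_graph(n):
    return {v: set(range(n)) - {v} for v in range(n)}


k6_stats = triangle_stats(complete_graph(6))
# K4 minus one edge: exactly the "two triangles sharing an edge" figure.
diamond = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
diamond_stats = triangle_stats(diamond)
```

For $K_6$ this yields $t = 20$, $t_4 = 90$, and 4 triangles per edge, so $t_4 = 3t(\Delta_{\max} - 1)/2$ holds with equality.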

Lemma 5.11. If $t_4 \ge tn/(2 \cdot s(\varepsilon))$ then a triangle will be found with probability at least $1 - \varepsilon/2$ in
any iteration where $s_m \ge s(\varepsilon)$.

Proof. Assume WLOG that $e_{\max} = \{x, y\}$ is an edge shared by $\Delta_{\max}$ triangles. The probability of
a node choosing both endpoints of $e_{\max}$ is $s_m(s_m - 1)/(n(n - 1)) \ge 0.99 s_m^2/n^2$ (for large values of $n$,
as $s_m \ge \sqrt{n}$). Given that this edge is chosen, the probability of missing all of the $\Delta_{\max}$ vertices that
complete a triangle with $e_{\max}$ is at most $(1 - \Delta_{\max}/n)^{s_m - 2}$. By Lemma 5.9 and our assumption on
$t_4$, we deduce that $\Delta_{\max} \ge n/(3s_m)$, therefore the probability of a specific node missing all triangles
comprising $e_{\max}$, conditional on $e_{\max}$ being chosen, is at most $(1 - 1/(3s_m))^{s_m - 2} \le e^{-1/3}/0.99$ (for
large values of $n$). Fixing some node $i$, we obtain that

$$\begin{aligned}
p_m &\ge \Pr[i \text{ finds a triangle with } e_{\max} \mid x, y \in C_i] \cdot \Pr[x, y \in C_i] \\
&\ge \frac{(0.99 - e^{-1/3})\, s_m^2}{n^2} \\
&\ge \frac{s(\varepsilon)^2}{4n^2} \\
&\ge \frac{\ln(2/\varepsilon)}{n}.
\end{aligned}$$

Applying Claim 5.6, we conclude $P_m \ge 1 - \varepsilon/2$. □

5.2.5 Deriving the Bound on the Round Complexity

Definition 5.12. $m(n, t, \varepsilon)$ is the minimal integer such that $s_{m(n,t,\varepsilon)} \ge s(\varepsilon)$.

Corollary 5.13. Given that $G$ contains at least $t$ triangles, for every $\varepsilon > 0$, with probability at
least $1 - \varepsilon/2$ TriSample terminates at the latest in iteration $m(n, t, \varepsilon)$.



Proof. Combine Lemmas 5.7 and 5.11. □

Theorem 5.14. Given that $G$ contains at least $t$ triangles, for every $\varepsilon \ge 1/n^{O(1)}$, with probability
at least $1 - \varepsilon$ TriSample terminates within $O(\min\{n^{1/3} t^{-2/3} \ln^{2/3} \varepsilon^{-1} + \ln \varepsilon^{-1}, n^{1/3}\})$ rounds. It
always outputs the correct result.

Proof. By Corollary 5.13, the algorithm terminates with probability $1 - \varepsilon/2$ after no more than
$m(n, t, \varepsilon)$ iterations. By Corollary 5.3, it thus terminates with probability $1 - \varepsilon$ within $O(2^{2m(n,t,\varepsilon)})$
rounds. By Remark 5.4 the round complexity is always in $O(n^{1/3})$ with high probability, altogether
showing the stated bound.

Correctness follows from the fact that the algorithm terminates either when it finds a triangle or
after executing TriPartition, which by Theorem 3.6 returns the correct output. □

Corollary 5.15. Algorithm TriSample terminates within $O(\lceil n^{1/3}/(t + 1)^{2/3} \rceil)$ rounds in expectation
and within $O(\min\{n^{1/3} \ln^{2/3} n / t^{2/3} + \ln n, n^{1/3}\})$ rounds w.h.p.

Remark 5.16. We can make sure the algorithm always terminates within $O(n^{1/3})$ rounds by
stopping it after $n^{1/3}$ rounds and switching to TriPartition even if $s_m < n^{2/3}$.

Corollary 5.17. For every $\varepsilon > 0$ it is possible to distinguish with probability $1 - \varepsilon$ between the
cases that $G$ is triangle-free and that $G$ has at least $t_0 \ge 1$ triangles within $O(t_0^{-2/3} n^{1/3} \ln^{2/3}(1/\varepsilon) +
\ln(1/\varepsilon))$ rounds.

Proof. Set $s < 2\max\{2 t_0^{-1/3} n^{2/3} \ln^{1/3}(1/\varepsilon), 2\sqrt{n \ln(1/\varepsilon)}\}$ to be the loop condition of TriSample. If no
triangle has been found during the loop, we output that $G$ is triangle-free. The running time bound
follows from Corollary 5.3, and correctness with probability $1 - \varepsilon$ is due to Theorem 5.14. □

5.3 Tightness of the Analysis 

Claim 5.18. The running time bound from Theorem 5.14 is asymptotically tight, that is, there are
graphs for which TriSample runs with probability at least $\varepsilon$ for $\Omega(n^{1/3} t^{-2/3} \ln^{2/3} \varepsilon^{-1})$ or $\Omega(\ln \varepsilon^{-1})$
rounds, respectively.

Proof. Consider a graph $G$ with $t \le n - 2$ triangles, all sharing a specific edge $e_0$. To find a triangle,
some node must sample both ends of $e_0$, and this happens with probability $s_m(s_m - 1)/(n(n - 1))$
per node. The probability that all nodes miss $e_0$ is at least $(1 - s_m(s_m - 1)/(n(n - 1)))^n$. If
$s_m \in o(\sqrt{n \ln(1/\varepsilon)})$ then this probability is in $\omega(\varepsilon)$.

Consider a graph $G$ with $t$ disjoint triangles, $t \le n/3$. The probability of a specific node to miss
a specific triangle is at least $1 - (s_m(s_m - 1)(s_m - 2))/(n(n - 1)(n - 2)) \ge 1 - ((s_m - 2)/n)^3$. By the
union bound, the probability of a specific node missing all triangles is at least $1 - t((s_m - 2)/n)^3$.
The probability that all nodes miss all triangles is therefore at least $(1 - t((s_m - 2)/n)^3)^n$. Assuming
that $s_m \in o(t^{-1/3} n^{2/3} \ln^{1/3}(1/\varepsilon))$, this is in $(1 - o(n^{-1} \ln(1/\varepsilon)))^n \subseteq \omega(\varepsilon)$. □